A Robust Real-Time Automatic License Plate Recognition Based on the YOLO Detector
Automatic License Plate Recognition (ALPR) has been a frequent topic of
research due to many practical applications. However, many of the current
solutions are still not robust in real-world situations, commonly depending on
many constraints. This paper presents a robust and efficient ALPR system based
on the state-of-the-art YOLO object detector. The Convolutional Neural Networks
(CNNs) are trained and fine-tuned for each ALPR stage so that they are robust
under different conditions (e.g., variations in camera, lighting, and
background). Especially for character segmentation and recognition, we design a
two-stage approach employing simple data augmentation tricks such as inverted
License Plates (LPs) and flipped characters. The resulting ALPR approach
achieved impressive results in two datasets. First, in the SSIG dataset,
composed of 2,000 frames from 101 vehicle videos, our system achieved a
recognition rate of 93.53% and 47 Frames Per Second (FPS), performing better
than both Sighthound and OpenALPR commercial systems (89.80% and 93.03%,
respectively) and considerably outperforming previous results (81.80%). Second,
targeting a more realistic scenario, we introduce a larger public dataset,
called the UFPR-ALPR dataset, designed for ALPR. This dataset contains 150 videos
and 4,500 frames captured when both camera and vehicles are moving and also
contains different types of vehicles (cars, motorcycles, buses and trucks). In
our proposed dataset, the trial versions of commercial systems achieved
recognition rates below 70%. On the other hand, our system performed better,
with a recognition rate of 78.33% and 35 FPS.
Comment: Accepted for presentation at the International Joint Conference on
Neural Networks (IJCNN) 201
Leveraging Model Fusion for Improved License Plate Recognition
License Plate Recognition (LPR) plays a critical role in various
applications, such as toll collection, parking management, and traffic law
enforcement. Although LPR has witnessed significant advancements through the
development of deep learning, there has been a noticeable lack of studies
exploring the potential improvements in results by fusing the outputs from
multiple recognition models. This research aims to fill this gap by
investigating the combination of up to 12 different models using
straightforward approaches, such as selecting the most confident prediction or
employing majority vote-based strategies. Our experiments encompass a wide
range of datasets, revealing substantial benefits of fusion approaches in both
intra- and cross-dataset setups. Essentially, fusing multiple models considerably
reduces the likelihood of obtaining subpar performance on a particular
dataset/scenario. We also found that combining models based on their speed is
an appealing approach. Specifically, for applications where the recognition
task can tolerate some additional time, though not excessively, an effective
strategy is to combine 4-6 models. These models may not be the most accurate
individually, but their fusion strikes an optimal balance between accuracy and
speed.
Comment: Accepted for presentation at the Iberoamerican Congress on Pattern
Recognition (CIARP) 202
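The two fusion strategies named in the abstract above (selecting the most confident prediction, and majority voting) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the (text, confidence) input format, and the rule of breaking vote ties by the highest single confidence are all assumptions made for the example.

```python
from collections import Counter


def fuse_most_confident(predictions):
    """Return the plate string whose model reported the highest confidence.

    `predictions` is a list of (text, confidence) pairs, one per model
    (hypothetical format chosen for this sketch).
    """
    text, _ = max(predictions, key=lambda p: p[1])
    return text


def fuse_majority_vote(predictions):
    """Return the plate string predicted by the most models.

    Ties are broken by the highest single confidence among the tied
    strings; this tie-break rule is an assumption of the sketch.
    """
    counts = Counter(text for text, _ in predictions)
    best_conf = {}
    for text, conf in predictions:
        best_conf[text] = max(conf, best_conf.get(text, 0.0))
    return max(counts, key=lambda t: (counts[t], best_conf[t]))


# Made-up outputs from four hypothetical recognition models.
preds = [("ABC1234", 0.91), ("ABC1234", 0.85), ("A8C1234", 0.97), ("ABC1Z34", 0.60)]
most_confident = fuse_most_confident(preds)
majority = fuse_majority_vote(preds)
```

In this made-up example the two strategies disagree: one model is very confident in a string no other model produced, while two models agree on a different string. That is the kind of case where the choice of fusion rule matters.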
A Benchmark for Iris Location and a Deep Learning Detector Evaluation
The iris is considered the biometric trait with the highest probability of
uniqueness. Iris location is an important task for biometric systems,
directly affecting the results obtained in specific applications such as iris
recognition, spoofing and contact lens detection, among others. This work
defines the iris location problem as the delimitation of the smallest squared
window that encompasses the iris region. In order to build a benchmark for iris
location we annotate (iris squared bounding boxes) four databases from
different biometric applications and make them publicly available to the
community. Besides these 4 annotated databases, we include 2 others from the
literature. We perform experiments on these six databases, five obtained with
near-infrared sensors and one with a visible-light sensor. We compare the
classical and outstanding Daugman iris location approach with two window-based
detectors: 1) a sliding window detector based on features from Histogram of
Oriented Gradients (HOG) and a linear Support Vector Machine (SVM) classifier;
2) a deep learning-based detector fine-tuned from the YOLO object detector.
Experimental results showed that the deep learning-based detector outperforms
the other ones in terms of accuracy and runtime (GPU version) and should be
chosen whenever possible.
Comment: Accepted for presentation at the International Joint Conference on
Neural Networks (IJCNN) 201
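The abstract above defines iris location as finding the smallest squared window that encompasses the iris region. A minimal sketch of that definition follows. It assumes the iris region is given as a list of (x, y) pixel coordinates, that the window is axis-aligned, and that clipping at the image border can be ignored; the function name and input format are hypothetical and do not come from the paper.

```python
def smallest_square_window(points):
    """Return (x_min, y_min, x_max, y_max) of the smallest axis-aligned
    square that contains every (x, y) point of the iris region.

    The square's side equals the larger of the region's width and height,
    and the square is centered on the region's bounding box.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    side = max(x_max - x_min, y_max - y_min)
    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2
    half = side / 2
    return (cx - half, cy - half, cx + half, cy + half)


# Made-up iris region that is wider than it is tall (e.g. partly occluded).
region = [(10, 20), (30, 20), (20, 15), (20, 25)]
window = smallest_square_window(region)
```

Centering the square on the bounding box is one design choice among several. Any square of the same side that still covers the region would satisfy the definition equally well.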
UFPR-Periocular: A Periocular Dataset Collected by Mobile Devices in Unconstrained Scenarios
Recently, ocular biometrics in unconstrained environments using images
obtained at visible wavelengths has gained researchers' attention,
especially with images captured by mobile devices. Periocular recognition has
been demonstrated to be an alternative when the iris trait is not available due
to occlusions or low image resolution. However, the periocular trait does not
have the high uniqueness presented in the iris trait. Thus, the use of datasets
containing many subjects is essential to assess biometric systems' capacity to
extract discriminating information from the periocular region. Also, to address
the within-class variability caused by lighting and attributes in the
periocular region, it is of paramount importance to use datasets with images of
the same subject captured in distinct sessions. As the datasets available in
the literature do not present all these factors, in this work, we present a new
periocular dataset containing samples from 1,122 subjects, acquired in 3
sessions by 196 different mobile devices. The images were captured under
unconstrained environments with just a single instruction to the participants:
to place their eyes on a region of interest. We also performed an extensive
benchmark with several Convolutional Neural Network (CNN) architectures and
models that have been employed in state-of-the-art approaches based on
Multi-class Classification, Multitask Learning, Pairwise Filters Network, and
Siamese Network. The results achieved in the closed- and open-world protocols,
considering the identification and verification tasks, show that this area
still needs research and development.
An Efficient and Layout-Independent Automatic License Plate Recognition System Based on the YOLO detector
This paper presents an efficient and layout-independent Automatic License
Plate Recognition (ALPR) system based on the state-of-the-art YOLO object
detector that contains a unified approach for license plate (LP) detection and
layout classification to improve the recognition results using post-processing
rules. The system is conceived by evaluating and optimizing different models,
aiming at achieving the best speed/accuracy trade-off at each stage. The
networks are trained using images from several datasets, with the addition of
various data augmentation techniques, so that they are robust under different
conditions. The proposed system achieved an average end-to-end recognition rate
of 96.9% across eight public datasets (from five different regions) used in the
experiments, outperforming both previous works and commercial systems in the
ChineseLP, OpenALPR-EU, SSIG-SegPlate and UFPR-ALPR datasets. In the other
datasets, the proposed approach achieved results competitive with those attained
by the baselines. Our system also achieved impressive frames per second (FPS)
rates on a high-end GPU, being able to perform in real time even when there are
four vehicles in the scene. An additional contribution is that we manually
labeled 38,351 bounding boxes on 6,239 images from public datasets and made the
annotations publicly available to the research community.